Dataset statistics
| Number of variables | 33 |
|---|---|
| Number of observations | 15407 |
| Missing cells | 71345 |
| Missing cells (%) | 14.0% |
| Duplicate rows | 2 |
| Duplicate rows (%) | < 0.1% |
| Total size in memory | 3.9 MiB |
| Average record size in memory | 264.0 B |
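The percentages and sizes in this table can be cross-checked from the raw counts. The short sketch below does only that arithmetic; the variable names are illustrative and the inputs are copied from the rows above.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 33
n_observations = 15407
missing_cells = 71345
duplicate_rows = 2
avg_record_bytes = 264.0  # "Average record size in memory"

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Multiplying the per-record size back out gives the total size in MiB.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells:    {total_cells}")
print(f"missing cells:  {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.3f}%")
print(f"total size:     {total_mib:.1f} MiB")
```

The results agree with the reported 14.0% missing cells, the "< 0.1%" duplicate share, and the 3.9 MiB total.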
Variable types
| CAT | 22 |
|---|---|
| NUM | 11 |
Reproduction
| Analysis started | 2020-06-27 02:28:18.785287 |
|---|---|
| Analysis finished | 2020-06-27 02:28:38.962047 |
| Duration | 20.18 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
Warnings
| Warning | Type |
|---|---|
| Dataset has 2 (< 0.1%) duplicate rows | Duplicates |
| atty_firm_name has a high cardinality: 453 distinct values | High cardinality |
| detail_cause has a high cardinality: 63 distinct values | High cardinality |
| how_injury_occur has a high cardinality: 15344 distinct values | High cardinality |
| injury_city has a high cardinality: 1340 distinct values | High cardinality |
| injury_postal has a high cardinality: 1841 distinct values | High cardinality |
| injury_state has a high cardinality: 53 distinct values | High cardinality |
| detail_cause is highly correlated with cause | High correlation |
| cause is highly correlated with detail_cause | High correlation |
| osha_injury_type is highly correlated with nature_injury | High correlation |
| nature_injury is highly correlated with osha_injury_type and 1 other fields | High correlation |
| type_loss is highly correlated with nature_injury | High correlation |
| ave_wkly_wage has 9535 (61.9%) missing values | Missing |
| claimant_age has 2163 (14.0%) missing values | Missing |
| atty_firm_name has 12410 (80.5%) missing values | Missing |
| marital_status has 12693 (82.4%) missing values | Missing |
| depart_code has 8123 (52.7%) missing values | Missing |
| injury_postal has 3684 (23.9%) missing values | Missing |
| #dependents has 14808 (96.1%) missing values | Missing |
| severity_index has 344 (2.2%) missing values | Missing |
| reforms_dummy has 6313 (41.0%) missing values | Missing |
| length_employed has 901 (5.8%) missing values | Missing |
| Dependent is highly skewed (γ1 = 27.90114494) | Skewed |
| how_injury_occur is uniformly distributed | Uniform |
| Dependent has 2689 (17.5%) zeros | Zeros |
| time_injury has 2828 (18.4%) zeros | Zeros |
| diff_carrier_employer has 3225 (20.9%) zeros | Zeros |
| diff_employer_injury has 11196 (72.7%) zeros | Zeros |
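The warning messages above follow a simple pattern: a per-column count is compared with a threshold and, if it exceeds it, formatted together with its share of the 15407 rows. The sketch below imitates that pattern. The function name `column_warnings` and all four thresholds are assumptions chosen to be consistent with the columns in this report; they are not taken from pandas-profiling's configuration.

```python
def column_warnings(name, n_rows, *, n_missing=0, n_zeros=0, n_distinct=None,
                    skewness=None, categorical=False,
                    max_cardinality=50, max_abs_skew=20.0,
                    min_missing_share=0.01, min_zero_share=0.10):
    """Return (message, type) pairs for one column summary.

    All four thresholds are assumed placeholder values.
    """
    found = []
    if categorical and n_distinct is not None and n_distinct > max_cardinality:
        found.append((f"{name} has a high cardinality: {n_distinct} distinct values",
                      "High cardinality"))
    if n_missing / n_rows > min_missing_share:
        share = n_missing / n_rows
        found.append((f"{name} has {n_missing} ({share:.1%}) missing values", "Missing"))
    if skewness is not None and abs(skewness) > max_abs_skew:
        found.append((f"{name} is highly skewed (γ1 = {skewness})", "Skewed"))
    if n_zeros / n_rows > min_zero_share:
        share = n_zeros / n_rows
        found.append((f"{name} has {n_zeros} ({share:.1%}) zeros", "Zeros"))
    return found


# Summary figures for two columns, copied from this report.
for message, kind in column_warnings("Dependent", 15407, n_zeros=2689,
                                     skewness=27.90114494):
    print(f"{message} | {kind}")
for message, kind in column_warnings("injury_state", 15407, n_missing=12,
                                     n_distinct=53, categorical=True):
    print(f"{message} | {kind}")
```

With these placeholder thresholds, injury_state (53 distinct values) is flagged for high cardinality while its 12 missing values fall below the missing-share cutoff.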
Dependent
Real number (ℝ≥0)
| Distinct count | 5024 |
|---|---|
| Unique (%) | 32.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10285.41370805478 |
|---|---|
| Minimum | 0.0 |
| Maximum | 3774290.0 |
| Zeros | 2689 |
| Zeros (%) | 17.5% |
| Memory size | 120.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 152 |
| median | 446 |
| Q3 | 1703.5 |
| 95-th percentile | 51491.9 |
| Maximum | 3774290 |
| Range | 3774290 |
| Interquartile range (IQR) | 1551.5 |
Descriptive statistics
| Standard deviation | 54196.59903 |
|---|---|
| Coefficient of variation (CV) | 5.269267778 |
| Kurtosis | 1591.997702 |
| Mean | 10285.41371 |
| Median Absolute Deviation (MAD) | 446 |
| Skewness | 27.90114494 |
| Sum | 158467369 |
| Variance | 2937271346 |
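Several of the statistics above are derived from one another: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the IQR is Q3 minus Q1. The sketch below recomputes them from the reported figures as a consistency check; the variable names are illustrative.

```python
import math

# Figures copied from the tables for this variable.
n = 15407                 # number of observations (no missing values here)
total = 158467369         # Sum
std = 54196.59903         # Standard deviation
q1, q3 = 152, 1703.5      # Q1 and Q3
minimum, maximum = 0, 3774290

mean = total / n                      # reported: 10285.41371
cv = std / mean                       # reported: 5.269267778
variance = std ** 2                   # reported: 2937271346
iqr = q3 - q1                         # reported: 1551.5
value_range = maximum - minimum       # reported: 3774290

# Going the other way, the reported variance gives back the standard deviation.
std_from_variance = math.sqrt(2937271346)

print(f"mean     = {mean:.5f}")
print(f"CV       = {cv:.9f}")
print(f"variance = {variance:.0f}")
print(f"IQR      = {iqr}")
print(f"range    = {value_range}")
```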
| Value | Count | Frequency (%) | |
| 0 | 2689 | 17.5% | |
| 154 | 42 | 0.3% | |
| 150 | 32 | 0.2% | |
| 222 | 28 | 0.2% | |
| 215 | 27 | 0.2% | |
| 180 | 27 | 0.2% | |
| 3 | 24 | 0.2% | |
| 232 | 24 | 0.2% | |
| 199 | 23 | 0.1% | |
| 167 | 23 | 0.1% | |
| Other values (5014) | 12468 | 80.9% |
| Value | Count | Frequency (%) | |
| 0 | 2689 | 17.5% | |
| 1 | 2 | < 0.1% | |
| 2 | 2 | < 0.1% | |
| 3 | 24 | 0.2% | |
| 4 | 9 | 0.1% |
| Value | Count | Frequency (%) | |
| 3774290 | 1 | < 0.1% | |
| 1159631 | 1 | < 0.1% | |
| 1086522 | 1 | < 0.1% | |
| 1083067 | 1 | < 0.1% | |
| 1076883 | 1 | < 0.1% |
ave_wkly_wage
Real number (ℝ≥0)
| Distinct count | 1966 |
|---|---|
| Unique (%) | 33.5% |
| Missing | 9535 |
| Missing (%) | 61.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1148.0342302452316 |
|---|---|
| Minimum | 2.0 |
| Maximum | 9999.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 120.4 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 150 |
| Q1 | 500 |
| median | 1000 |
| Q3 | 1529.25 |
| 95-th percentile | 2669 |
| Maximum | 9999 |
| Range | 9997 |
| Interquartile range (IQR) | 1029.25 |
Descriptive statistics
| Standard deviation | 920.9787874 |
|---|---|
| Coefficient of variation (CV) | 0.8022224104 |
| Kurtosis | 13.36293414 |
| Mean | 1148.03423 |
| Median Absolute Deviation (MAD) | 500 |
| Skewness | 2.590677501 |
| Sum | 6741257 |
| Variance | 848201.9268 |
| Value | Count | Frequency (%) | |
| 500 | 383 | 2.5% | |
| 320 | 204 | 1.3% | |
| 1000 | 180 | 1.2% | |
| 600 | 149 | 1.0% | |
| 150 | 126 | 0.8% | |
| 100 | 121 | 0.8% | |
| 400 | 84 | 0.5% | |
| 1500 | 70 | 0.5% | |
| 1200 | 68 | 0.4% | |
| 300 | 61 | 0.4% | |
| Other values (1956) | 4426 | 28.7% | |
| (Missing) | 9535 | 61.9% |
| Value | Count | Frequency (%) | |
| 2 | 3 | < 0.1% | |
| 3 | 2 | < 0.1% | |
| 5 | 2 | < 0.1% | |
| 7 | 2 | < 0.1% | |
| 8 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 9999 | 1 | < 0.1% | |
| 9316 | 1 | < 0.1% | |
| 9200 | 1 | < 0.1% | |
| 9000 | 1 | < 0.1% | |
| 8900 | 1 | < 0.1% |
body_part
Categorical
| Distinct count | 46 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| Finger(s) | 1426 | 9.3% | |
| Low Back Area | 1317 | 8.5% | |
| Knee | 1236 | 8.0% | |
| Other Facial Soft Tissue | 1113 | 7.2% | |
| Eye(s) | 1028 | 6.7% | |
| Ankle | 909 | 5.9% | |
| Hand | 816 | 5.3% | |
| Shoulder(s) | 740 | 4.8% | |
| Foot | 617 | 4.0% | |
| Lower Leg | 568 | 3.7% | |
| Other values (36) | 5637 | 36.6% |
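The frequency tables in this report list the ten most common values, fold the remainder into an "Other values (k)" row, and add a "(Missing)" row where needed. The following is a minimal sketch of that layout using only the standard library. The function name `frequency_table` and the sample data are made up for illustration; the sketch also does not reproduce the report's "< 0.1%" formatting for very small shares.

```python
from collections import Counter


def frequency_table(values, top_n=10):
    """Rows of (label, count, share) with 'Other values' and '(Missing)' rows."""
    n_total = len(values)
    present = [v for v in values if v is not None]
    counts = Counter(present)
    top = counts.most_common(top_n)

    rows = list(top)
    n_other_distinct = len(counts) - len(top)
    if n_other_distinct > 0:
        other_count = len(present) - sum(count for _, count in top)
        rows.append((f"Other values ({n_other_distinct})", other_count))
    if len(present) < n_total:
        rows.append(("(Missing)", n_total - len(present)))

    # Shares are relative to all rows, including missing ones.
    return [(label, count, f"{count / n_total:.1%}") for label, count in rows]


# Illustrative sample, not taken from the dataset.
sample = ["Knee"] * 4 + ["Hand"] * 3 + ["Foot"] * 2 + ["Ankle"] + [None] * 2
for row in frequency_table(sample, top_n=2):
    print(row)
```

As in the report, the counts in the resulting rows always add up to the total number of observations.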
Length
| Max length | 54 |
|---|---|
| Median length | 6 |
| Mean length | 10.31161161 |
| Min length | 3 |
cause
Categorical
| Distinct count | 10 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| Strain or Injury By | 4174 | 27.1% | |
| Fall, Slip or Trip Injury | 2662 | 17.3% | |
| Struck or Injured By | 2533 | 16.4% | |
| Miscellaneous Causes | 2004 | 13.0% | |
| Cut, Puncture, Scrape Injured By | 1878 | 12.2% | |
| Striking Against or Stepping on | 1009 | 6.5% | |
| Caught In, Under or Between | 558 | 3.6% | |
| Motor Vehicle | 358 | 2.3% | |
| Burn or Scald - Heat or Cold Exposure | 211 | 1.4% | |
| Rubbed or Abraded By | 20 | 0.1% |
Length
| Max length | 37 |
|---|---|
| Median length | 20 |
| Mean length | 23.09975985 |
| Min length | 13 |
claimant_age
Real number (ℝ≥0)
| Distinct count | 84 |
|---|---|
| Unique (%) | 0.6% |
| Missing | 2163 |
| Missing (%) | 14.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 40.286922379945636 |
|---|---|
| Minimum | 1.0 |
| Maximum | 91.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 120.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 23 |
| Q1 | 31 |
| median | 40 |
| Q3 | 49 |
| 95-th percentile | 60 |
| Maximum | 91 |
| Range | 90 |
| Interquartile range (IQR) | 18 |
Descriptive statistics
| Standard deviation | 11.82069754 |
|---|---|
| Coefficient of variation (CV) | 0.2934127713 |
| Kurtosis | -0.5246541589 |
| Mean | 40.28692238 |
| Median Absolute Deviation (MAD) | 9 |
| Skewness | 0.1788198528 |
| Sum | 533560 |
| Variance | 139.7288904 |
| Value | Count | Frequency (%) | |
| 41 | 436 | 2.8% | |
| 43 | 382 | 2.5% | |
| 36 | 378 | 2.5% | |
| 47 | 377 | 2.4% | |
| 46 | 376 | 2.4% | |
| 45 | 369 | 2.4% | |
| 40 | 369 | 2.4% | |
| 39 | 366 | 2.4% | |
| 34 | 364 | 2.4% | |
| 24 | 361 | 2.3% | |
| Other values (74) | 9466 | 61.4% | |
| (Missing) | 2163 | 14.0% |
| Value | Count | Frequency (%) | |
| 1 | 6 | < 0.1% | |
| 2 | 2 | < 0.1% | |
| 3 | 2 | < 0.1% | |
| 4 | 2 | < 0.1% | |
| 5 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 91 | 1 | < 0.1% | |
| 89 | 1 | < 0.1% | |
| 85 | 1 | < 0.1% | |
| 84 | 1 | < 0.1% | |
| 83 | 2 | < 0.1% |
atty_firm_name
Categorical
| Distinct count | 453 |
|---|---|
| Unique (%) | 15.1% |
| Missing | 12410 |
| Missing (%) | 80.5% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| TLEVY, STERN & FORD | 116 | 0.8% | |
| TJ. LEEDS BARROL, IV ATTORNEY AT LAW | 58 | 0.4% | |
| TLEVY, FORD & WALLACH | 51 | 0.3% | |
| TLEVY, STERN, & FORD | 50 | 0.3% | |
| TCARUSO, SPILLANE, LEIGHTON,CONTRASTANO, ULANER & SAVINO | 37 | 0.2% | |
| TLAW OFFICES OF MCNAMARA & | 36 | 0.2% | |
| TMARDER, ESKESEN & NASS | 35 | 0.2% | |
| TKLEIN WAGNER & MORRIS | 34 | 0.2% | |
| TKLEE & WOOLF, LLP | 33 | 0.2% | |
| TLAW OFFICE OF CHRISTINE T. NELSON | 33 | 0.2% | |
| Other values (443) | 2514 | 16.3% | |
| (Missing) | 12410 | 80.5% |
Length
| Max length | 57 |
|---|---|
| Median length | 3 |
| Mean length | 7.532809762 |
| Min length | 3 |
gender
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| Male | 12375 | 80.3% | |
| Female | 2950 | 19.1% | |
| Uknown | 82 | 0.5% |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.39358733 |
| Min length | 4 |
marital_status
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 12693 |
| Missing (%) | 82.4% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| Unmarried, Single, Widowed, Divorced | 1435 | 9.3% | |
| Married | 1258 | 8.2% | |
| Separated | 21 | 0.1% | |
| (Missing) | 12693 | 82.4% |
Length
| Max length | 36 |
|---|---|
| Median length | 3 |
| Mean length | 6.408385799 |
| Min length | 3 |
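The length statistics above look odd at first: the shortest label, "Married", has 7 characters, yet the minimum and median lengths are both 3. The arithmetic below suggests that the 12693 missing entries were each counted as the three-character string `nan`. This reading is an inference from the numbers, not something the report states.

```python
# Counts copied from the frequency table above; lengths come from the labels.
counts = {
    "Unmarried, Single, Widowed, Divorced": 1435,
    "Married": 1258,
    "Separated": 21,
}
n_missing = 12693
n_present = sum(counts.values())
n_rows = n_present + n_missing

total_chars = sum(len(label) * count for label, count in counts.items())

# Mean length over the observed labels only.
mean_observed_only = total_chars / n_present

# Mean length if every missing entry is treated as the string "nan".
mean_with_nan = (total_chars + len("nan") * n_missing) / n_rows

print(f"rows:                  {n_rows}")
print(f"mean, observed only:   {mean_observed_only:.3f}")
print(f"mean, missing as 'nan': {mean_with_nan:.9f}")
```

Only the second figure reproduces the reported mean length of 6.408385799, so the length statistics of columns with many missing values should be read with that in mind.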
claim_st
Categorical
| Distinct count | 49 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| California | 8912 | 57.8% | |
| New York | 1121 | 7.3% | |
| New Mexico | 610 | 4.0% | |
| Georgia | 532 | 3.5% | |
| Texas | 517 | 3.4% | |
| North Carolina | 499 | 3.2% | |
| Louisiana | 437 | 2.8% | |
| New Jersey | 341 | 2.2% | |
| Florida | 236 | 1.5% | |
| Illinois | 224 | 1.5% | |
| Other values (39) | 1978 | 12.8% |
Length
| Max length | 25 |
|---|---|
| Median length | 10 |
| Mean length | 9.481469462 |
| Min length | 4 |
depart_code
Real number (ℝ≥0)
| Distinct count | 23 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 8123 |
| Missing (%) | 52.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.119989017023613 |
|---|---|
| Minimum | 1.0 |
| Maximum | 23.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 120.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 6 |
| median | 14 |
| Q3 | 18 |
| 95-th percentile | 21 |
| Maximum | 23 |
| Range | 22 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 7.014638483 |
|---|---|
| Coefficient of variation (CV) | 0.5787660758 |
| Kurtosis | -1.477812356 |
| Mean | 12.11998902 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | -0.1316306344 |
| Sum | 88282 |
| Variance | 49.20515304 |
| Value | Count | Frequency (%) | |
| 21 | 1164 | 7.6% | |
| 17 | 977 | 6.3% | |
| 8 | 856 | 5.6% | |
| 3 | 693 | 4.5% | |
| 6 | 576 | 3.7% | |
| 2 | 531 | 3.4% | |
| 14 | 467 | 3.0% | |
| 18 | 409 | 2.7% | |
| 11 | 339 | 2.2% | |
| 1 | 232 | 1.5% | |
| Other values (13) | 1040 | 6.8% | |
| (Missing) | 8123 | 52.7% |
| Value | Count | Frequency (%) | |
| 1 | 232 | 1.5% | |
| 2 | 531 | 3.4% | |
| 3 | 693 | 4.5% | |
| 4 | 97 | 0.6% | |
| 5 | 88 | 0.6% |
| Value | Count | Frequency (%) | |
| 23 | 52 | 0.3% | |
| 22 | 121 | 0.8% | |
| 21 | 1164 | 7.6% | |
| 20 | 178 | 1.2% | |
| 19 | 142 | 0.9% |
detail_cause
Categorical
| Distinct count | 63 |
|---|---|
| Unique (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| Strain/Injury by Misc | 1577 | 10.2% | |
| Strain/Injury by Lifting | 1193 | 7.7% | |
| Struck by Falling/Flying Object | 916 | 5.9% | |
| Misc, Other | 838 | 5.4% | |
| Fall/Slip, Same Level | 795 | 5.2% | |
| Misc, Foreign Body in Eye | 778 | 5.0% | |
| Cut/Puncture/Scrape, Object Lift/Handled | 751 | 4.9% | |
| Strike/Step On, Fixed Object | 711 | 4.6% | |
| Fall/Slip, Misc | 631 | 4.1% | |
| Fall/Slip, Different Level | 545 | 3.5% | |
| Other values (53) | 6672 | 43.3% |
Length
| Max length | 40 |
|---|---|
| Median length | 25 |
| Mean length | 25.54812747 |
| Min length | 9 |
domestic_foreign
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| Domestic | 15190 | 98.6% | |
| Foreign | 217 | 1.4% |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 7.985915493 |
| Min length | 7 |
employ_status
Categorical
| Distinct count | 12 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| Unknown/Other | 12564 | 81.5% | |
| Full-Time | 2440 | 15.8% | |
| Seasonal | 190 | 1.2% | |
| Part-Time | 97 | 0.6% | |
| Piece Worker | 66 | 0.4% | |
| On Strike | 18 | 0.1% | |
| Apprenticeship Full-Time | 14 | 0.1% | |
| Disabled | 9 | 0.1% | |
| Retired | 3 | < 0.1% | |
| Not Employed | 2 | < 0.1% | |
| Other values (2) | 4 | < 0.1% |
Length
| Max length | 24 |
|---|---|
| Median length | 13 |
| Mean length | 12.27740637 |
| Min length | 7 |
handling_office
Categorical
| Distinct count | 28 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| LOS ANGELE | 7112 | 46.2% | |
| SACRAMENTO | 1884 | 12.2% | |
| DALLAS WC | 1345 | 8.7% | |
| NEW JERSEY | 861 | 5.6% | |
| LONG ISLAN | 534 | 3.5% | |
| WC SOUTHEA | 522 | 3.4% | |
| CHARLOTTE | 515 | 3.3% | |
| IN-STATE A | 353 | 2.3% | |
| ATLANTA | 297 | 1.9% | |
| ILLINOIS | 281 | 1.8% | |
| Other values (18) | 1703 | 11.1% |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 9.69942234 |
| Min length | 6 |
how_injury_occur
Categorical
| Distinct count | 15344 |
|---|---|
| Unique (%) | 99.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| WORKING IN WOODED TALL GRASS AREA - EE WAS BITTEN BY DEER TI | 17 | 0.1% | |
| UNKNOWN | 7 | < 0.1% | |
| EMPLOYEE WAS EXPOSED TO HEPATITIS A VIRUS WHILE WORKING ON S | 7 | < 0.1% | |
| WHILE WORKING IN A WOODED - TALL GRASS AREA - EE WAS BITTEN | 7 | < 0.1% | |
| EE WAS EXPOSED TO HEPATITIS A VIRUS WHILE WORKING ON SET. | 5 | < 0.1% | |
| EE WAS EXPOSED TO HEPATITIS A VIRUS WHILE WORKING ON SET | 4 | < 0.1% | |
| EE WAS EXPOSED TO HEPATITIS A VIRUS WHILE WORKING ON SET - P | 3 | < 0.1% | |
| EE INHALED SMOKE FROM A FIRE THAT STARTED WHEN A HOT WIRE WA | 2 | < 0.1% | |
| EMPLOYEE WAS WORKING ON SET WHEN HE EXPERIENCED SHORTNESS OF | 2 | < 0.1% | |
| WHILE WORKING IN WOODED - TALL GRASS AREA - EMPLOYEE WAS BIT | 2 | < 0.1% | |
| Other values (15334) | 15351 | 99.6% |
Length
| Max length | 60 |
|---|---|
| Median length | 60 |
| Mean length | 57.90601675 |
| Min length | 7 |
injury_city
Categorical
| Distinct count | 1340 |
|---|---|
| Unique (%) | 8.7% |
| Missing | 1 |
| Missing (%) | < 0.1% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| LOS ANGELES | 2294 | 14.9% | |
| UNKNOWN | 1538 | 10.0% | |
| BURBANK | 1506 | 9.8% | |
| NEW ORLEANS | 610 | 4.0% | |
| NEW YORK | 393 | 2.6% | |
| BROOKLYN | 388 | 2.5% | |
| WILMINGTON | 328 | 2.1% | |
| CULVER CITY | 298 | 1.9% | |
| AUSTIN | 261 | 1.7% | |
| ATLANTA | 222 | 1.4% | |
| Other values (1330) | 7568 | 49.1% |
Length
| Max length | 19 |
|---|---|
| Median length | 8 |
| Mean length | 9.119101707 |
| Min length | 1 |
injury_postal
Categorical
| Distinct count | 1841 |
|---|---|
| Unique (%) | 15.7% |
| Missing | 3684 |
| Missing (%) | 23.9% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| 91502 | 1369 | 8.9% | |
| 95816 | 406 | 2.6% | |
| 90038 | 303 | 2.0% | |
| 90001 | 222 | 1.4% | |
| 90028 | 219 | 1.4% | |
| 91505 | 199 | 1.3% | |
| 90232 | 171 | 1.1% | |
| 91504 | 167 | 1.1% | |
| 91608 | 158 | 1.0% | |
| 91521 | 127 | 0.8% | |
| Other values (1831) | 8382 | 54.4% | |
| (Missing) | 3684 | 23.9% |
Length
| Max length | 9 |
|---|---|
| Median length | 5 |
| Mean length | 4.513467904 |
| Min length | 3 |
injury_state
Categorical
| Distinct count | 53 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 12 |
| Missing (%) | 0.1% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| California | 8019 | 52.0% | |
| New York | 1356 | 8.8% | |
| Louisiana | 1132 | 7.3% | |
| Georgia | 654 | 4.2% | |
| North Carolina | 583 | 3.8% | |
| Texas | 403 | 2.6% | |
| New Mexico | 383 | 2.5% | |
| Pennsylvania | 311 | 2.0% | |
| Michigan | 274 | 1.8% | |
| Utah | 263 | 1.7% | |
| Other values (43) | 2017 | 13.1% |
Length
| Max length | 25 |
|---|---|
| Median length | 10 |
| Mean length | 9.506003765 |
| Min length | 3 |
jurisdiction
Categorical
| Distinct count | 47 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| California | 9094 | 59.0% | |
| New York | 1525 | 9.9% | |
| Louisiana | 851 | 5.5% | |
| Georgia | 565 | 3.7% | |
| North Carolina | 535 | 3.5% | |
| Texas | 353 | 2.3% | |
| New Mexico | 332 | 2.2% | |
| Pennsylvania | 244 | 1.6% | |
| Illinois | 241 | 1.6% | |
| Utah | 236 | 1.5% | |
| Other values (37) | 1431 | 9.3% |
Length
| Max length | 20 |
|---|---|
| Median length | 10 |
| Mean length | 9.506523009 |
| Min length | 4 |
lost_time_or_medicalonly
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 11 |
| Missing (%) | 0.1% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| Medical Only | 11684 | 75.8% | |
| Lost Time | 3712 | 24.1% | |
| (Missing) | 11 | 0.1% |
Length
| Max length | 12 |
|---|---|
| Median length | 12 |
| Mean length | 11.27078601 |
| Min length | 3 |
nature_injury
Categorical
| Distinct count | 45 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| Strain | 3583 | 23.3% | |
| Laceration | 2418 | 15.7% | |
| Specific Injury - All Other | 2221 | 14.4% | |
| Contusion | 1610 | 10.4% | |
| Sprain | 1025 | 6.7% | |
| Puncture | 785 | 5.1% | |
| Foreign Body | 759 | 4.9% | |
| Inflammation | 670 | 4.3% | |
| Fracture | 570 | 3.7% | |
| Infection | 200 | 1.3% | |
| Other values (35) | 1566 | 10.2% |
Length
| Max length | 59 |
|---|---|
| Median length | 9 |
| Mean length | 11.81729084 |
| Min length | 4 |
#dependents
Real number (ℝ≥0)
| Distinct count | 9 |
|---|---|
| Unique (%) | 1.5% |
| Missing | 14808 |
| Missing (%) | 96.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.9131886477462436 |
|---|---|
| Minimum | 1.0 |
| Maximum | 18.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 120.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 18 |
| Range | 17 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.334821653 |
|---|---|
| Coefficient of variation (CV) | 0.6976947384 |
| Kurtosis | 38.79484103 |
| Mean | 1.913188648 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 4.345641318 |
| Sum | 1146 |
| Variance | 1.781748846 |
| Value | Count | Frequency (%) | |
| 1 | 281 | 1.8% | |
| 2 | 187 | 1.2% | |
| 3 | 84 | 0.5% | |
| 4 | 29 | 0.2% | |
| 5 | 8 | 0.1% | |
| 6 | 4 | < 0.1% | |
| 9 | 3 | < 0.1% | |
| 7 | 2 | < 0.1% | |
| 18 | 1 | < 0.1% | |
| (Missing) | 14808 | 96.1% |
| Value | Count | Frequency (%) | |
| 1 | 281 | 1.8% | |
| 2 | 187 | 1.2% | |
| 3 | 84 | 0.5% | |
| 4 | 29 | 0.2% | |
| 5 | 8 | 0.1% |
| Value | Count | Frequency (%) | |
| 18 | 1 | < 0.1% | |
| 9 | 3 | < 0.1% | |
| 7 | 2 | < 0.1% | |
| 6 | 4 | < 0.1% | |
| 5 | 8 | 0.1% |
osha_injury_type
Categorical
| Distinct count | 6 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 2 |
| Missing (%) | < 0.1% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| Injury | 15128 | 98.2% | |
| Skin disorder | 156 | 1.0% | |
| Respiratory condition | 60 | 0.4% | |
| Poisoning | 27 | 0.2% | |
| Hearing loss | 26 | 0.2% | |
| All other illnesses | 8 | 0.1% | |
| (Missing) | 2 | < 0.1% |
Length
| Max length | 21 |
|---|---|
| Median length | 6 |
| Mean length | 6.151035244 |
| Min length | 3 |
severity_index
Categorical
| Distinct count | 10 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 344 |
| Missing (%) | 2.2% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| No Serious Injury Indicated | 15000 | 97.4% | |
| Fatality | 19 | 0.1% | |
| Fractured Bone(s) | 15 | 0.1% | |
| Back Injury involving Surgery/Extended Disability | 10 | 0.1% | |
| Involves AIDS, Herpes, TSS, Cancer, Other Diseases | 5 | < 0.1% | |
| Serious Head Injury | 4 | < 0.1% | |
| Heart Attack or Cardio-Vascular Accident | 4 | < 0.1% | |
| Minor Amputation | 2 | < 0.1% | |
| Serious Burns | 2 | < 0.1% | |
| Major Amputation | 2 | < 0.1% | |
| (Missing) | 344 | 2.2% |
Length
| Max length | 50 |
|---|---|
| Median length | 27 |
| Mean length | 26.44934121 |
| Min length | 3 |
time_injury
Real number (ℝ≥0)
| Distinct count | 581 |
|---|---|
| Unique (%) | 3.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1010.9847471928344 |
|---|---|
| Minimum | 0 |
| Maximum | 2359 |
| Zeros | 2828 |
| Zeros (%) | 18.4% |
| Memory size | 120.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 615 |
| median | 1035 |
| Q3 | 1510 |
| 95-th percentile | 2025 |
| Maximum | 2359 |
| Range | 2359 |
| Interquartile range (IQR) | 895 |
Descriptive statistics
| Standard deviation | 660.7386831 |
|---|---|
| Coefficient of variation (CV) | 0.6535594973 |
| Kurtosis | -0.9523935227 |
| Mean | 1010.984747 |
| Median Absolute Deviation (MAD) | 465 |
| Skewness | -0.1933581149 |
| Sum | 15576242 |
| Variance | 436575.6074 |
| Value | Count | Frequency (%) | |
| 0 | 2828 | 18.4% | |
| 1000 | 661 | 4.3% | |
| 1400 | 538 | 3.5% | |
| 1100 | 530 | 3.4% | |
| 1500 | 470 | 3.1% | |
| 900 | 407 | 2.6% | |
| 1600 | 401 | 2.6% | |
| 1300 | 386 | 2.5% | |
| 800 | 384 | 2.5% | |
| 1700 | 300 | 1.9% | |
| Other values (571) | 8502 | 55.2% |
| Value | Count | Frequency (%) | |
| 0 | 2828 | 18.4% | |
| 1 | 2 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| 5 | 15 | 0.1% |
| Value | Count | Frequency (%) | |
| 2359 | 2 | < 0.1% | |
| 2355 | 7 | < 0.1% | |
| 2353 | 1 | < 0.1% | |
| 2350 | 7 | < 0.1% | |
| 2348 | 1 | < 0.1% |
type_loss
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 53 |
| Missing (%) | 0.3% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| Specific Injury | 15195 | 98.6% | |
| Cumulative Trauma | 159 | 1.0% | |
| (Missing) | 53 | 0.3% |
Length
| Max length | 17 |
|---|---|
| Median length | 15 |
| Mean length | 14.97936003 |
| Min length | 3 |
policy_yr
Real number (ℝ≥0)
| Distinct count | 15 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2007.6086194586876 |
|---|---|
| Minimum | 2000 |
| Maximum | 2014 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 120.4 KiB |
Quantile statistics
| Minimum | 2000 |
|---|---|
| 5-th percentile | 2000 |
| Q1 | 2004 |
| median | 2008 |
| Q3 | 2011 |
| 95-th percentile | 2014 |
| Maximum | 2014 |
| Range | 14 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 4.268828711 |
|---|---|
| Coefficient of variation (CV) | 0.002126325156 |
| Kurtosis | -1.203913341 |
| Mean | 2007.608619 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | -0.1645031735 |
| Sum | 30931226 |
| Variance | 18.22289857 |
| Value | Count | Frequency (%) | |
| 2011 | 1440 | 9.3% | |
| 2013 | 1325 | 8.6% | |
| 2012 | 1196 | 7.8% | |
| 2005 | 1141 | 7.4% | |
| 2014 | 1123 | 7.3% | |
| 2010 | 1095 | 7.1% | |
| 2004 | 1029 | 6.7% | |
| 2008 | 969 | 6.3% | |
| 2002 | 950 | 6.2% | |
| 2009 | 932 | 6.0% | |
| Other values (5) | 4207 | 27.3% |
| Value | Count | Frequency (%) | |
| 2000 | 778 | 5.0% | |
| 2001 | 697 | 4.5% | |
| 2002 | 950 | 6.2% | |
| 2003 | 908 | 5.9% | |
| 2004 | 1029 | 6.7% |
| Value | Count | Frequency (%) | |
| 2014 | 1123 | 7.3% | |
| 2013 | 1325 | 8.6% | |
| 2012 | 1196 | 7.8% | |
| 2011 | 1440 | 9.3% | |
| 2010 | 1095 | 7.1% |
reforms_dummy
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 6313 |
| Missing (%) | 41.0% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| California Refom 1 | 4983 | 32.3% | |
| California Refom 0 | 3022 | 19.6% | |
| California Reform 2 | 1089 | 7.1% | |
| (Missing) | 6313 | 41.0% |
Length
| Max length | 19 |
|---|---|
| Median length | 18 |
| Mean length | 11.92444993 |
| Min length | 3 |
length_employed
Real number (ℝ≥0)
| Distinct count | 46 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 901 |
| Missing (%) | 5.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.692954639459534 |
|---|---|
| Minimum | 1.0 |
| Maximum | 60.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 120.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 7 |
| Q3 | 11 |
| 95-th percentile | 15 |
| Maximum | 60 |
| Range | 59 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 4.677775864 |
|---|---|
| Coefficient of variation (CV) | 0.6080597226 |
| Kurtosis | 6.983011565 |
| Mean | 7.692954639 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 1.152591289 |
| Sum | 111594 |
| Variance | 21.88158704 |
| Value | Count | Frequency (%) | |
| 4 | 1352 | 8.8% | |
| 2 | 1239 | 8.0% | |
| 3 | 1127 | 7.3% | |
| 11 | 1082 | 7.0% | |
| 5 | 1061 | 6.9% | |
| 10 | 1048 | 6.8% | |
| 7 | 933 | 6.1% | |
| 13 | 920 | 6.0% | |
| 12 | 905 | 5.9% | |
| 6 | 883 | 5.7% | |
| Other values (36) | 3956 | 25.7% | |
| (Missing) | 901 | 5.8% |
| Value | Count | Frequency (%) | |
| 1 | 869 | 5.6% | |
| 2 | 1239 | 8.0% | |
| 3 | 1127 | 7.3% | |
| 4 | 1352 | 8.8% | |
| 5 | 1061 | 6.9% |
| Value | Count | Frequency (%) | |
| 60 | 1 | < 0.1% | |
| 55 | 1 | < 0.1% | |
| 54 | 1 | < 0.1% | |
| 52 | 1 | < 0.1% | |
| 51 | 3 | < 0.1% |
diff_carrier_employer
Real number (ℝ)
| Distinct count | 247 |
|---|---|
| Unique (%) | 1.6% |
| Missing | 146 |
| Missing (%) | 0.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.929559006618177 |
|---|---|
| Minimum | -1094.0 |
| Maximum | 1994.0 |
| Zeros | 3225 |
| Zeros (%) | 20.9% |
| Memory size | 120.4 KiB |
Quantile statistics
| Minimum | -1094 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 5 |
| 95-th percentile | 27 |
| Maximum | 1994 |
| Range | 3088 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 44.78888828 |
|---|---|
| Coefficient of variation (CV) | 5.648345418 |
| Kurtosis | 656.5273028 |
| Mean | 7.929559007 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 18.61863324 |
| Sum | 121013 |
| Variance | 2006.044513 |
| Value | Count | Frequency (%) | |
| 1 | 3792 | 24.6% | |
| 0 | 3225 | 20.9% | |
| 2 | 1579 | 10.2% | |
| 3 | 1379 | 9.0% | |
| 4 | 1049 | 6.8% | |
| 5 | 711 | 4.6% | |
| 6 | 576 | 3.7% | |
| 7 | 452 | 2.9% | |
| 8 | 288 | 1.9% | |
| 9 | 167 | 1.1% | |
| Other values (237) | 2043 | 13.3% | |
| (Missing) | 146 | 0.9% |
| Value | Count | Frequency (%) | |
| -1094 | 1 | < 0.1% | |
| -693 | 1 | < 0.1% | |
| -363 | 1 | < 0.1% | |
| -362 | 1 | < 0.1% | |
| -346 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1994 | 1 | < 0.1% | |
| 1795 | 1 | < 0.1% | |
| 1245 | 1 | < 0.1% | |
| 1213 | 1 | < 0.1% | |
| 1141 | 2 | < 0.1% |
diff_employer_injury
Real number (ℝ)
| Distinct count | 415 |
|---|---|
| Unique (%) | 2.7% |
| Missing | 146 |
| Missing (%) | 0.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 19.966253849682197 |
|---|---|
| Minimum | -31.0 |
| Maximum | 4200.0 |
| Zeros | 11196 |
| Zeros (%) | 72.7% |
| Memory size | 120.4 KiB |
Quantile statistics
| Minimum | -31 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 20 |
| Maximum | 4200 |
| Range | 4231 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 163.6920873 |
|---|---|
| Coefficient of variation (CV) | 8.198437651 |
| Kurtosis | 206.7916229 |
| Mean | 19.96625385 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 12.992533 |
| Sum | 304705 |
| Variance | 26795.09945 |
| Value | Count | Frequency (%) | |
| 0 | 11196 | 72.7% | |
| 1 | 1381 | 9.0% | |
| 2 | 438 | 2.8% | |
| 3 | 356 | 2.3% | |
| 4 | 245 | 1.6% | |
| 5 | 163 | 1.1% | |
| 7 | 126 | 0.8% | |
| 6 | 119 | 0.8% | |
| 8 | 64 | 0.4% | |
| 9 | 60 | 0.4% | |
| Other values (405) | 1113 | 7.2% | |
| (Missing) | 146 | 0.9% |
| Value | Count | Frequency (%) | |
| -31 | 1 | < 0.1% | |
| -10 | 1 | < 0.1% | |
| -5 | 1 | < 0.1% | |
| -3 | 1 | < 0.1% | |
| -2 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 4200 | 1 | < 0.1% | |
| 3889 | 1 | < 0.1% | |
| 3765 | 1 | < 0.1% | |
| 3524 | 1 | < 0.1% | |
| 3334 | 1 | < 0.1% |
shift
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 120.4 KiB |
| Value | Count | Frequency (%) | |
| 2nd | 7084 | 46.0% | |
| 1st | 6084 | 39.5% | |
| 3rd | 2239 | 14.5% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
length_how_injury
Real number (ℝ≥0)
| Distinct count | 51 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 57.9060167456351 |
|---|---|
| Minimum | 7 |
| Maximum | 60 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 120.4 KiB |
Quantile statistics
| Minimum | 7 |
|---|---|
| 5-th percentile | 45 |
| Q1 | 59 |
| median | 60 |
| Q3 | 60 |
| 95-th percentile | 60 |
| Maximum | 60 |
| Range | 53 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 5.702373679 |
|---|---|
| Coefficient of variation (CV) | 0.09847635875 |
| Kurtosis | 17.49582622 |
| Mean | 57.90601675 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -3.880200295 |
| Sum | 892158 |
| Variance | 32.51706557 |
| Value | Count | Frequency (%) | |
| 60 | 10357 | 67.2% | |
| 59 | 2528 | 16.4% | |
| 58 | 238 | 1.5% | |
| 57 | 225 | 1.5% | |
| 56 | 167 | 1.1% | |
| 55 | 163 | 1.1% | |
| 54 | 137 | 0.9% | |
| 53 | 135 | 0.9% | |
| 51 | 133 | 0.9% | |
| 49 | 114 | 0.7% | |
| Other values (41) | 1210 | 7.9% |
| Value | Count | Frequency (%) | |
| 7 | 7 | < 0.1% | |
| 8 | 1 | < 0.1% | |
| 10 | 1 | < 0.1% | |
| 11 | 1 | < 0.1% | |
| 12 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 60 | 10357 | 67.2% | |
| 59 | 2528 | 16.4% | |
| 58 | 238 | 1.5% | |
| 57 | 225 | 1.5% | |
| 56 | 167 | 1.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
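As a minimal sketch of the calculation described above (not the code the report itself runs), r can be computed from the covariance and the two standard deviations using only the Python standard library. The function name `pearson_r` and the sample data are illustrative assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance and standard deviations; the 1/n factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Made-up sample data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(x, y)

# Shifting and rescaling each variable separately (with positive scale
# factors) leaves r unchanged up to floating-point rounding.
r_rescaled = pearson_r([10.0 * a + 3.0 for a in x], [0.5 * b - 7.0 for b in y])

print(r, r_rescaled)
```

Note that a negative scale factor on one of the variables would flip the sign of r while keeping its magnitude.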
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
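The rank-based calculation can be sketched as follows: rank each variable, then apply the Pearson formula to the ranks. This is an illustrative standard-library version, not the report's own implementation; the helper names and the choice to give tied values their average rank are assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship (made-up data): y = x**3.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)  # perfectly monotonic, so rho is 1 up to rounding
r = pearson_r(x, y)       # noticeably below 1 because the relation is not linear

print(rho, r)
```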
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
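The pair-counting definition above can be written out directly. The sketch below implements that formula literally (often called tau-a); library implementations commonly use a tie-corrected variant instead, so results can differ when the data contain ties. The function name and sample data are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(x)), 2):
        # The sign of this product says whether the pair is ordered the same
        # way in both variables (concordant) or in opposite ways (discordant).
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 means a tie in x or y: counted as neither.
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Made-up sample data: 10 pairs, of which 8 are concordant and 2 discordant.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

tau = kendall_tau_a(x, y)
print(tau)
```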
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available in the phik project documentation.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | Dependent | ave_wkly_wage | body_part | cause | claimant_age | atty_firm_name | gender | marital_status | claim_st | depart_code | detail_cause | domestic_foreign | employ_status | handling_office | how_injury_occur | injury_city | injury_postal | injury_state | jurisdiction | lost_time_or_medicalonly | nature_injury | #dependents | osha_injury_type | severity_index | time_injury | type_loss | policy_yr | reforms_dummy | length_employed | diff_carrier_employer | diff_employer_injury | shift | length_how_injury |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 98679.0 | 500.0 | Pelvis | Struck or Injured By | 21.0 | NaN | Female | NaN | California | NaN | Struck by Falling/Flying Object | Domestic | Full-Time | LOS ANGELE | GOING DOWN SKI HILL ON AN AIR MATTRESS AS PART OF A SKIT, | PORTLAND | 97201 | Oregon | California | Lost Time | Fracture | NaN | Injury | Fractured Bone(s) | 1430 | Specific Injury | 2001 | California Refom 0 | 14.0 | 4.0 | 0.0 | 2nd | 57 |
| 1 | 55727.0 | 1037.0 | Low Back Area | Strain or Injury By | NaN | NaN | Male | NaN | California | NaN | Strain/Injury by Lifting | Domestic | Full-Time | IN-STATE A | REPETITIVE LIFTING OF CABLE, EE STRAINED LOWER BACK | PEARL HARBOR | NaN | Hawaii | Hawaii | Lost Time | Strain | 1.0 | Injury | No Serious Injury Indicated | 0 | Specific Injury | 2001 | NaN | 14.0 | 1.0 | 2.0 | 1st | 51 |
| 2 | 185833.0 | 929.0 | Low Back Area | Strain or Injury By | 63.0 | TROBINSON & CHUR ATTORNEYS AT LAW | Male | Married | Hawaii | NaN | Strain/Injury by Carrying | Domestic | Unknown/Other | IN-STATE A | STRAINED LOW BACK MOVING A PALM TREE WITH CO-WORKER | PEARL HARBOR | 96860 | Hawaii | Hawaii | Lost Time | Strain | NaN | Injury | No Serious Injury Indicated | 0 | Specific Injury | 2001 | NaN | 14.0 | 9.0 | 0.0 | 1st | 51 |
| 3 | 98615.0 | 1226.0 | Multiple Body Parts | Strain or Injury By | 49.0 | IBARRY STEVENS;;M;; | Male | NaN | Idaho | NaN | Strain/Injury by Repetitive Motion | Domestic | Unknown/Other | LOS ANGELE | EE CLAIMS: CT 1973 -/02 TO BOTH SHOULDERS - SPINE - AND EAR | BURBANK | 91502 | California | California | Lost Time | Specific Injury - All Other | NaN | Injury | No Serious Injury Indicated | 0 | Specific Injury | 2002 | California Refom 0 | NaN | 8.0 | 632.0 | 1st | 60 |
| 4 | 51396.0 | NaN | Other Facial Soft Tissue | Miscellaneous Causes | 51.0 | IBARRY STEVENS;;M;; | Male | NaN | Idaho | NaN | Misc, No Physical Cause | Domestic | Unknown/Other | LOS ANGELE | EMPLOYEE CLAIMS: CT 11/16/93 -8/1/03; PHYSICAL STRESS AND ST | BURBANK | 91502 | California | California | Lost Time | Specific Injury - All Other | NaN | Injury | No Serious Injury Indicated | 0 | Specific Injury | 2003 | California Refom 0 | NaN | 1.0 | 200.0 | 1st | 60 |
| 5 | 4079.0 | NaN | Multiple Upper Extremities | Strain or Injury By | 55.0 | IWAX & WAX;;A;; | Male | NaN | California | NaN | Strain/Injury by Repetitive Motion | Domestic | Unknown/Other | LOS ANGELE | EE CLAIMS: CT 1980 -NOV 2003 TO NECK - BACK - BOTH UPPER EXT | BURBANK | 91502 | California | California | Lost Time | Carpal Tunnel Syndrome | NaN | Injury | No Serious Injury Indicated | 0 | Cumulative Trauma | 2002 | California Refom 0 | NaN | 0.0 | 444.0 | 1st | 60 |
| 6 | 1909.0 | 1129.0 | Low Back Area | Strain or Injury By | 49.0 | TCRAIG RICHLIN | Male | NaN | California | NaN | Strain/Injury by Misc | Domestic | Unknown/Other | LOS ANGELE | EE CLAIMS CT 09/25/00 09/25/01 TO LEFT LOWER EXTREMITY AND B | UNKNOWN | NaN | California | California | Lost Time | Specific Injury - All Other | NaN | Injury | No Serious Injury Indicated | 0 | Specific Injury | 2000 | California Refom 0 | 15.0 | 3.0 | 1254.0 | 1st | 60 |
| 7 | 6687.0 | NaN | Low Back Area | Strain or Injury By | 36.0 | NaN | Male | NaN | California | NaN | Strain/Injury by Repetitive Motion | Domestic | Unknown/Other | LOS ANGELE | EE CLAIMS REPETITIVE BACK INJURY CT 8/1/99-11/03 FROM REPETI | MISSION HILLS | 91345 | California | California | Lost Time | Strain | NaN | Injury | No Serious Injury Indicated | 0 | Specific Injury | 2001 | California Refom 0 | NaN | 10.0 | 1223.0 | 1st | 60 |
| 8 | 5352.0 | NaN | Low Back Area | Strain or Injury By | 45.0 | NaN | Male | NaN | California | NaN | Strain/Injury by Repetitive Motion | Domestic | Unknown/Other | LOS ANGELE | EE CLAIMS; CT 4/23/02-4/23/03 TO SPINE. BILATERAL UPPER EXTR | BURBANK | 91504 | California | California | Lost Time | Strain | NaN | Injury | No Serious Injury Indicated | 0 | Specific Injury | 2003 | California Refom 0 | NaN | 6.0 | 436.0 | 1st | 60 |
| 9 | 6324.0 | 1000.0 | Lower Leg | Strain or Injury By | 45.0 | NaN | Male | NaN | California | NaN | Strain/Injury by Repetitive Motion | Domestic | Full-Time | LOS ANGELE | EE CLAIMS: L LEG - VENUS THROMBOSIS WHILE WORKING IN LAS VEG | LAS VEGAS | 89104 | Nevada | California | Lost Time | Strain | NaN | Injury | No Serious Injury Indicated | 0 | Specific Injury | 2002 | California Refom 0 | 13.0 | 2.0 | 642.0 | 1st | 60 |
Last rows
| | Dependent | ave_wkly_wage | body_part | cause | claimant_age | atty_firm_name | gender | marital_status | claim_st | depart_code | detail_cause | domestic_foreign | employ_status | handling_office | how_injury_occur | injury_city | injury_postal | injury_state | jurisdiction | lost_time_or_medicalonly | nature_injury | #dependents | osha_injury_type | severity_index | time_injury | type_loss | policy_yr | reforms_dummy | length_employed | diff_carrier_employer | diff_employer_injury | shift | length_how_injury |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15397 | 264.0 | NaN | Abdomen including Groin | Strain or Injury By | 38.0 | NaN | Male | NaN | Virginia | 18.0 | Strain/Injury by Lifting | Domestic | Unknown/Other | WC SOUTHEA | PATIENT WAS LIFTING LENS CASES, STATES SOMETHING DID NOT FEE | RICHMOND | 23222 | Virginia | Virginia | Medical Only | Strain | NaN | Injury | No Serious Injury Indicated | 830 | Specific Injury | 2014 | NaN | 1.0 | 0.0 | 0.0 | 1st | 60 |
| 15398 | 1034.0 | NaN | Brain | Struck or Injured By | 48.0 | NaN | Male | NaN | Florida | 20.0 | Struck by Motor Vehicle | Domestic | Unknown/Other | WC SOUTHEA | EE WAS PERFORMING A STUNT WHEN HE GOT HIT BY A PICTURE CAR A | STONE MOUNTAIN | 30087 | Georgia | Florida | Medical Only | Concussion | NaN | Injury | No Serious Injury Indicated | 2230 | Specific Injury | 2014 | NaN | 1.0 | NaN | NaN | 3rd | 60 |
| 15399 | 926.0 | NaN | Lower Leg | Cut, Puncture, Scrape Injured By | 60.0 | NaN | Male | NaN | Arizona | 19.0 | Cut/Puncture/Scrape, Object Lift/Handled | Domestic | Unknown/Other | WC SOUTHEA | EE STATES WHILE MOVING STEEL HE LACERATED HIS R UPPER ASPECT | FAYETTEVILLE | 30214 | Georgia | Georgia | Medical Only | Infection | NaN | Injury | No Serious Injury Indicated | 1100 | Specific Injury | 2014 | NaN | 1.0 | 13.0 | 5.0 | 2nd | 60 |
| 15400 | 780.0 | NaN | Shoulder(s) | Strain or Injury By | 39.0 | NaN | Female | NaN | Georgia | 8.0 | Strain/Injury by Carrying | Domestic | Unknown/Other | WC SOUTHEA | WHILE LOADING THE TRAILER EE FELT DISCOMFORT IN R SHOULDER. | SENOIA | 30276 | Georgia | Georgia | Medical Only | Strain | NaN | Injury | No Serious Injury Indicated | 1818 | Specific Injury | 2014 | NaN | 1.0 | 2.0 | 0.0 | 3rd | 59 |
| 15401 | 0.0 | NaN | Hand | Cut, Puncture, Scrape Injured By | 44.0 | NaN | Male | NaN | North Carolina | 8.0 | Cut/Puncture/Scrape, Hand Tool | Domestic | Unknown/Other | WC SOUTHEA | EMPLOYEE WAS CUTTING ZIP TIES WITH A KNIFE WHEN IT SLIPPED O | WILMINGTON | 28401 | North Carolina | North Carolina | Medical Only | Laceration | NaN | Injury | No Serious Injury Indicated | 830 | Specific Injury | 2014 | NaN | 1.0 | 0.0 | 0.0 | 1st | 60 |
| 15402 | 2405.0 | NaN | Knee | Fall, Slip or Trip Injury | 21.0 | NaN | Female | NaN | Georgia | 6.0 | Fall/Slip, Into Opening | Domestic | Unknown/Other | WC SOUTHEA | EMPLOYEE WAS RETRIEVING PAINT SUPPLIES FROM A TRAILER WHEN S | FAYETTEVILLE | 30214 | Georgia | Georgia | Medical Only | Inflammation | NaN | Injury | No Serious Injury Indicated | 1115 | Specific Injury | 2014 | NaN | 1.0 | 0.0 | 0.0 | 2nd | 60 |
| 15403 | 1807.0 | 6486.0 | Shoulder(s) | Strain or Injury By | 33.0 | NaN | Male | NaN | Georgia | 20.0 | Strain/Injury by Misc | Domestic | Full-Time | WC SOUTHEA | PATIENT WAS IN A SCENE THAT REQUIRED HIM TO RUN, STOP AND FA | PEACHTREE CITY | NaN | Georgia | Georgia | Lost Time | Strain | NaN | Injury | No Serious Injury Indicated | 1558 | Specific Injury | 2014 | NaN | 1.0 | 0.0 | 0.0 | 2nd | 60 |
| 15404 | 0.0 | NaN | Ankle | Strain or Injury By | 33.0 | NaN | Male | Unmarried, Single, Widowed, Divorced | Virginia | 22.0 | Strain/Injury by Jumping | Domestic | Unknown/Other | WC SOUTHEA | PATIENT JUMPED OVER A 4' WOODEN FENCE, LANDED ON UNEVEN GROU | HENRICO | 23238 | Virginia | Virginia | Medical Only | Sprain | NaN | Injury | No Serious Injury Indicated | 1900 | Specific Injury | 2014 | NaN | 1.0 | 1.0 | 0.0 | 3rd | 60 |
| 15405 | 507.0 | NaN | Arm | Strain or Injury By | 34.0 | NaN | Male | NaN | Georgia | 3.0 | Strain/Injury by Misc | Domestic | Unknown/Other | WC SOUTHEA | PATIENT STATED HE WAS INJURED WHEN PLASTERING WALLS. PRODUCT | HIRAM | 30141 | Georgia | Georgia | Medical Only | Strain | NaN | Injury | No Serious Injury Indicated | 1041 | Specific Injury | 2014 | NaN | 1.0 | 1.0 | 0.0 | 2nd | 60 |
| 15406 | 0.0 | NaN | Low Back Area | Strain or Injury By | 49.0 | NaN | Female | NaN | Missouri | 11.0 | Strain/Injury by Twisting | Domestic | Unknown/Other | WC SOUTHEA | WHILE PERFORMING REQUIRED JOB DUTIES, EE TWISTED HER BACK, A | NASHVILLE | 37214 | Tennessee | South Carolina | Medical Only | Strain | NaN | Injury | No Serious Injury Indicated | 1930 | Specific Injury | 2014 | NaN | 1.0 | 0.0 | 3.0 | 3rd | 60 |